Cooperative behavior acquisition by asynchronous policy renewal that enables simultaneous learning in multiagent environment

Authors

  • Shoichi Ikenoue
  • Minoru
  • Koh Hosoda
Abstract

This paper presents a method for simultaneous learning in a multiagent environment that lets cooperative behaviors emerge. Each agent has one policy and one action value function: the former is for action execution based on the action value function updated in the previous stage, and the latter is for learning based on the episodes experienced by the ε-greedy method. As a result, all agents behave according to fixed policies, so the non-Markovian problem is avoided except during the update periods, which depend on the learning progress of each agent. To avoid local maxima caused by this asynchronous renewal of action value functions, optimistic action values are given as initial values, which helps the exploration process avoid being trapped in local maxima. Experimental results on a cooperative task in a dynamic multiagent environment, RoboCup, are shown and discussed.
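The two-table idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `Agent`, the tabular state/action encoding, the Q-learning-style update, and all parameter values are assumptions made for illustration. The abstract only states that acting uses the value function from the previous stage, that learning uses ε-greedy episodes, and that initial values are optimistic. When to call `renew_policy` is left to the caller, since the abstract says only that it depends on each agent's learning progress.

```python
import random


class Agent:
    """Hypothetical agent with a frozen acting table and a separate learning table."""

    def __init__(self, n_states, n_actions, optimistic_value=1.0,
                 epsilon=0.1, alpha=0.5, gamma=0.9, seed=0):
        self.rng = random.Random(seed)
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.alpha = alpha
        self.gamma = gamma
        # Learning table, initialised optimistically to encourage exploration.
        self.q_learn = [[optimistic_value] * n_actions for _ in range(n_states)]
        # Frozen snapshot from the "previous stage", used only for acting.
        self.q_act = [row[:] for row in self.q_learn]

    def act(self, state):
        """Epsilon-greedy action selection on the frozen table."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q_act[state]
        return row.index(max(row))

    def learn(self, state, action, reward, next_state):
        """Assumed Q-learning-style update; touches only the learning table."""
        target = reward + self.gamma * max(self.q_learn[next_state])
        self.q_learn[state][action] += self.alpha * (
            target - self.q_learn[state][action])

    def renew_policy(self):
        """Asynchronous renewal: copy the learned values into the acting table."""
        self.q_act = [row[:] for row in self.q_learn]
```

Because `act` reads only the frozen snapshot, other agents see a stationary policy between renewals, which is the property the abstract relies on to sidestep the non-Markovian problem outside update periods.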


Similar resources

Modular Learning Systems for Behavior Acquisition in Multi-Agent Environment

There has been a great deal of research on reinforcement learning in multirobot/agent environments during the last decades. A wide range of applications, such as forage robots (Mataric, 1997), soccer playing robots (Asada et al., 1996), prey-pursuing robots (Fujii et al., 1998) and so on, have been investigated. However, a straightforward application of the simple reinforcement learning method to ...

Learning to Communicate and Act in Cooperative Multiagent Systems using Hierarchical Reinforcement Learning

In this paper, we address the issue of rational communication behavior among autonomous agents. We extend our previously reported cooperative hierarchical reinforcement learning (HRL) algorithm to include communication decision and propose a new multiagent HRL algorithm, called COM-Cooperative HRL. In this algorithm, at specific levels of the hierarchy, called cooperation levels, a group of sub...

Multiagent Coordination in Cooperative Q-learning Systems

Many reinforcement learning architectures fail to learn optimal group behaviors in the multiagent domain. Although these coordination difficulties are often attributed to the non-Markovian environment created by the gradually-changing policies of concurrently learning agents, a careful analysis of the situation reveals an underlying problem structure which can cause suboptimal group policies ev...

Decentralized and Cooperative Multi-Sensor Multi-Target Tracking With Asynchronous Bearing Measurements

Bearings only tracking is a challenging issue with many applications in military and commercial areas. In distributed multi-sensor multi-target bearings only tracking, sensors are far from each other, but are exchanging data using telecommunication equipment. In addition to the general benefits of distributed systems, this tracking system has another important advantage: if the sensors are suff...

A Multiagent Reinforcement Learning algorithm to solve the Community Detection Problem

Community detection is a challenging optimization problem that consists of searching for communities that belong to a network under the assumption that the nodes of the same community share properties that enable the detection of new characteristics or functional relationships in the network. Although there are many algorithms developed for community detection, most of them are unsuitable when ...

Publication date: 2003